Interactive plotting with holoviews#

Authors & Contributors#

Authors#

Contributors#

  • Alejandro Coca-Castro, The Alan Turing Institute (United Kingdom), @acocac

Overview

Questions
  • How to clip Xarray DataSet with a shapefile to plot an area of interest?
  • How to generate interactive maps with Holoviews?
  • How to browse multiple times with Holoviews interactive maps?
  • How to generate an interactive 1D plot with Holoviews?
Objectives
  • Learn about Holoviews and Xarray for creating interactive plots
  • Learn to create interactive maps with Holoviews
  • Learn to create multiple interactive plots with holoviews
  • Learn to create a 1D line plot with holoviews

Context#

We will be using holoviews with Xarray to visualize the Vegetation Condition Index (VCI) ([1] Kogan, 1995), a well-established indicator to estimate droughts from remote sensing data.

Data#

This episode we will use data that has been generated in the previous episode. You can also download it in zenodo: C_GLS_NDVI_20220101_20220701_Lombardia_S3_2_masked.nc

If the dataset is not present in the same folder than this Jupyter notebook, it is downloaded from zenodo.

import os.path
import wget

site_url = 'https://zenodo.org/record/6969999/files/C_GLS_NDVI_20220101_20220701_Lombardia_S3_2_masked.nc'

try:
    if os.path.exists(site_url.split('/')[-1]):
        print("Local file available")
    else:
        file_name = wget.download(site_url)
        print(file_name)
except:
    print("An error occured")
Local file available

Setup#

This episode uses the following Python packages:

  • xarray

  • rioxarray

  • netcdf4

  • h5netcdf

  • wget

  • numpy

  • matplotlib

  • cartopy

  • hvplot

  • geopandas

Please install these packages if not already available in your Python environment. Below, we only install packages that are not available in the EGI-ACE EOSC deployment of Pangeo for the FOSS4G course.

Packages#

In this episode, Python packages are imported when we start to use them. However, for best software practices, we recommend you to install and import all the necessary libraries at the top of your Jupyter notebook.

import xarray as xr

Open local dataset#

cgls_ds = xr.open_dataset('C_GLS_NDVI_20220101_20220701_Lombardia_S3_2_masked.nc')

Tip

If you get an error with the previous command, check the previous episode where the input file CAMS-PM2_5-20211222.netcdf is downlaoded locally and it is in the same directory as your Jupyter Notebook.

cgls_ds
<xarray.Dataset>
Dimensions:  (time: 20, lon: 984, lat: 612)
Coordinates:
  * time     (time) datetime64[ns] 2022-01-01 2022-01-11 ... 2022-07-11
  * lon      (lon) float64 8.502 8.505 8.508 8.511 ... 11.42 11.42 11.42 11.43
  * lat      (lat) float64 46.5 46.5 46.49 46.49 ... 44.69 44.69 44.68 44.68
Data variables:
    NDVI     (time, lat, lon) float64 ...

Clipping data according to a polygon#

one of the basic concept in GIS is to clip data using a vector geometry. Xarray isn’t directly capable of dealing with vectors but thanks to rioxarray can be easily achieved Rioxarray extends Xarray with all most of the plus that Rasterio (GDAL) brings.

Read a shapefile with the Area Of Interest (AOI)#

import geopandas as gpd

We define the area of interest using the Global Administrative Unit Layers GAUL G2015_2014 provided by FAO-UN (see Documentation). GeoPandas, a python-based library extending the capabilities of Pandas to deal with geometry and spatial operations, will help to manage geodata.

The official data distribution from FAO is through the WFS service (below how to retrieve data):

GAUL = gpd.read_file('https://data.apps.fao.org/map/gsrv/gsrv1/gaul/wfs?'
                     'service=WFS&version=2.0.0&'
                     'Request=GetFeature&'
                     'TypeNames=gaul:g2015_2014_2&'
                     'srsName=EPSG%3A4326&'
                     'maxFeatures=2500&'
                     'outputFormat=json')

Unfortunately seems that the service is pretty slow. As alternative to this approach the JRC MARS unit is distributing the original dataset that was in shapefile format. To faster the fetch Is strongly advise to use this approach.

For the training course, we also created a tiny file containing information about Italy only.

try:
    GAUL = gpd.read_file('Italy.geojson')
except:
    GAUL = gpd.read_file('zip+https://mars.jrc.ec.europa.eu/asap/files/gaul1_asap.zip') 

Data are organized in a tabular structure. For each element an index a data (made of columns) and a geometry is defined.

Geometry are defined through shapely geometry objects with three different basic classes:

  • Points and Multi-Points

  • Lines and Multi-Lines

  • Polygons and Multi-Polygons

GAUL.head(5)
asap1_id name1 name1_shr name0 asap0_id name0_shr km2_tot km2_crop km2_range an_crop an_range water_lim geometry
0 1172 Friuli-venezia Giulia Friuli Italy 172 Italy 7722 2995 1111 1 1 0 POLYGON ((12.79517 46.64600, 12.80945 46.63482...
1 1202 Calabria Calabria Italy 172 Italy 15203 7152 2127 1 1 1 POLYGON ((16.42261 40.14416, 16.43624 40.12897...
2 1204 Campania Campania Italy 172 Italy 13618 7376 2378 1 1 1 POLYGON ((14.16099 41.48058, 14.16653 41.48039...
3 1207 Abruzzi Abruzzi Italy 172 Italy 10797 4699 3300 1 1 1 POLYGON ((13.91523 42.89441, 13.92206 42.85505...
4 1211 Emilia-romagna Emilia-romagna Italy 172 Italy 22204 14958 771 1 1 1 POLYGON ((12.49151 43.91780, 12.48442 43.90761...

In the cell below, we subset the polygon geometry in which name1 field equals to Lombardia.

AOI_name = 'Lombardia'
AOI = GAUL[GAUL.name1 == AOI_name]
AOI_poly = AOI.geometry
AOI_poly
14    POLYGON ((10.23973 46.62177, 10.25084 46.61110...
Name: geometry, dtype: geometry

Second step that has to be defined is the reference system.

cgls_ds = cgls_ds.NDVI.rio.write_crs(4326)

Once this has been done we can clip data with the polygon that has been read through geopandas at the beginning of the notebook.

NDVI_AOI = cgls_ds.rio.clip(AOI_poly, crs=4326)

Visualize with matplotlib#

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
fig = plt.figure(1, figsize=[20, 10])

# We're using cartopy and are plotting in PlateCarree projection 
# (see documentation on cartopy)
ax = plt.subplot(1, 1, 1, projection=ccrs.PlateCarree())
#ax.set_extent([15.5, 27.5, 36, 41], crs=ccrs.PlateCarree()) # lon1 lon2 lat1 lat2
ax.coastlines(resolution='10m')
ax.gridlines(draw_labels=True)

NDVI_AOI.sel(time='2022-06-01').plot(ax=ax, transform=ccrs.PlateCarree(), cmap="RdYlGn")

# One way to customize your title
plt.title("S3 NDVI over Lombardia", fontsize=18)
Text(0.5, 1.0, 'S3 NDVI over Lombardia')
/usr/share/miniconda/envs/foss4g/lib/python3.10/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
../_images/visualization_26_2.png

Visualization with holoviews#

import holoviews as hv
import hvplot.xarray
import hvplot.pandas

Plotting data through the HoloViews back-end thanks to the hvplot that acts as high-level plotting API

NDVI_AOI.isel(time=0).hvplot(cmap="RdYlGn", width=1000, height=1000)

Having a look to data distribution can reveal a lot about data

NDVI_AOI[0].hvplot.hist(cmap="RdYlGn",bins=25, width=800, height=700)

Multi-plots using groupby#

To be able to visualize interactively all the different available times, we can use groupby time.

NDVI_AOI.hvplot(groupby ='time', cmap="RdYlGn", width=800, height=700)

Once data is limited to the AOI a histogram can be added to the visualization.

NDVI_AOI.hvplot(groupby='time', cmap='RdYlGn', width=800, height=700 ).hist()

Plot a single point over the time dimension#

NDVI_AOI.sel(lat=45.88, lon=8.63, method='nearest').hvplot(ylim=(-0.08, 0.92))
Key Points
  • rioarray for clipping over an area of interest
  • Interactive plot with Holoviews and Xarray